This document presents results from testing the number of random samplings used for consensus partitioning on two datasets (the TCGA GBM microarray dataset and the HSMM single-cell RNA-Seq dataset). We tested 25, 50, 100 and 200 samplings, with both row sampling and column sampling. For each combination of parameters, cola was run 100 times. The scripts for the analysis can be found here.
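The design above can be sketched in a few lines. The following is a minimal illustration in Python, not cola's own (R) code: it enumerates the parameter grid described in the text and shows one common way to compute a 1-PAC score from the partitions produced by repeated samplings. The function names and the 0.1/0.9 cutoffs for ambiguous consensus values are assumptions made for this sketch; cola's exact definition of PAC may differ.

```python
from itertools import combinations, product

# Experimental grid as described in the text.
N_SAMPLINGS = (25, 50, 100, 200)
SAMPLING_MODES = ("row", "column")
N_REPEATS = 100  # cola runs per parameter combination


def parameter_grid():
    """Return every (sampling mode, number of samplings) combination."""
    return list(product(SAMPLING_MODES, N_SAMPLINGS))


def consensus_matrix(partitions):
    """Fraction of partitions in which each pair of samples shares a cluster.

    `partitions` is a list of label lists, one per random sampling,
    each holding one cluster label per sample (hypothetical input format).
    """
    n = len(partitions[0])
    consensus = {}
    for i, j in combinations(range(n), 2):
        together = sum(1 for labels in partitions if labels[i] == labels[j])
        consensus[(i, j)] = together / len(partitions)
    return consensus


def one_minus_pac(consensus, lower=0.1, upper=0.9):
    """1 - PAC, taking PAC as the fraction of sample pairs whose consensus
    value lies strictly between `lower` and `upper`.

    The cutoffs are an assumption for illustration only.
    """
    values = list(consensus.values())
    ambiguous = sum(1 for v in values if lower < v < upper)
    return 1 - ambiguous / len(values)
```

With four sampling counts and two sampling modes, the grid has eight combinations, so 100 repeats of each gives 800 cola runs per dataset. Partitions that always agree give a 1-PAC of 1, while partitions that often disagree push the score toward 0.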
For each dataset, there are four figures:
Figure 1A. Variability of 1-PAC scores from the 100 cola runs. Consensus partitionings were applied by row sampling.
Figure 1B. Variability of 1-PAC scores from the 100 cola runs. Consensus partitionings were applied by column sampling.
Figure 1C. Variability of mean silhouette scores from the 100 cola runs. Consensus partitionings were applied by row sampling.
Figure 1D. Variability of mean silhouette scores from the 100 cola runs. Consensus partitionings were applied by column sampling.
Figure 1E. Variability of concordance scores from the 100 cola runs. Consensus partitionings were applied by row sampling.
Figure 1F. Variability of concordance scores from the 100 cola runs. Consensus partitionings were applied by column sampling.
Figure 2A. Mean concordance in the 100 cola runs. Consensus partitionings were applied by row sampling.
Figure 2B. Mean concordance in the 100 cola runs. Consensus partitionings were applied by column sampling.
Figure 3A. Mean concordance of consensus partitioning with 25 and 200 samplings. Consensus partitionings were applied by row sampling.
Figure 3B. Mean concordance of consensus partitioning with 25 and 200 samplings. Consensus partitionings were applied by column sampling.
Figure 4A. Relations between mean 1-PAC from 25/200 samplings and concordance. Consensus partitionings were applied by row sampling.
Figure 4B. Relations between mean 1-PAC from 25/200 samplings and concordance. Consensus partitionings were applied by column sampling.
Figure 5A. Variability of 1-PAC scores from the 100 cola runs. Consensus partitionings were applied by row sampling.
Figure 5B. Variability of 1-PAC scores from the 100 cola runs. Consensus partitionings were applied by column sampling.
Figure 5C. Variability of mean silhouette scores from the 100 cola runs. Consensus partitionings were applied by row sampling.
Figure 5D. Variability of mean silhouette scores from the 100 cola runs. Consensus partitionings were applied by column sampling.
Figure 5E. Variability of concordance scores from the 100 cola runs. Consensus partitionings were applied by row sampling.
Figure 5F. Variability of concordance scores from the 100 cola runs. Consensus partitionings were applied by column sampling.
Figure 6A. Mean concordance in the 100 cola runs. Consensus partitionings were applied by row sampling.
Figure 6B. Mean concordance in the 100 cola runs. Consensus partitionings were applied by column sampling.
Figure 7A. Mean concordance of consensus partitioning with 25 and 200 samplings. Consensus partitionings were applied by row sampling.
Figure 7B. Mean concordance of consensus partitioning with 25 and 200 samplings. Consensus partitionings were applied by column sampling.
Figure 8A. Relations between mean 1-PAC from 25/200 samplings and concordance. Consensus partitionings were applied by row sampling.
Figure 8B. Relations between mean 1-PAC from 25/200 samplings and concordance. Consensus partitionings were applied by column sampling.